Accurate Robust and Efficient Error Estimation for Decision Trees

نویسنده

  • Lixin Fan
چکیده

This paper illustrates a novel approach to the estimation of generalization error of decision tree classifiers. We set out the study of decision tree errors in the context of consistency analysis theory, which proved that the Bayes error can be achieved only if when the number of data samples thrown into each leaf node goes to infinity. For the more challenging and practical case where the sample size is finite or small, a novel sampling error term is introduced in this paper to cope with the small sample problem effectively and efficiently. Extensive experimental results show that the proposed error estimate is superior to the well-known K-fold cross validation methods in terms of robustness and accuracy. Moreover it is orders of magnitudes more efficient than cross validation methods.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust state estimation in power systems using pre-filtering measurement data

State estimation is the foundation of any control and decision making in power networks. The first requirement for a secure network is a precise and safe state estimator in order to make decisions based on accurate knowledge of the network status. This paper introduces a new estimator which is able to detect bad data with few calculations without need for repetitions and estimation residual cal...

متن کامل

Bayes, E-Bayes and Robust Bayes Premium Estimation and Prediction under the Squared Log Error Loss Function

In risk analysis based on Bayesian framework, premium calculation requires specification of a prior distribution for the risk parameter in the heterogeneous portfolio. When the prior knowledge is vague, the E-Bayesian and robust Bayesian analysis can be used to handle the uncertainty in specifying the prior distribution by considering a class of priors instead of a single prior. In th...

متن کامل

Investigation of the Allometric Models in Estimation of Poplar (Populus deltoides) Height

One of the most important issues in forest biometrics is the use of allometric functions to estimate the tree height by using diameter-height models. Measuring the total height of trees is usually a complex and time-consuming process. In allometric functions, the diameter is measured directly but the height of the tree is an estimate of an allometric model, which will be more accurate if the cr...

متن کامل

A Robust Distributed Estimation Algorithm under Alpha-Stable Noise Condition

Robust adaptive estimation of unknown parameter has been an important issue in recent years for reliable operation in the distributed networks. The conventional adaptive estimation algorithms that rely on mean square error (MSE) criterion exhibit good performance in the presence of Gaussian noise, but their performance drastically decreases under impulsive noise. In this paper, we propose a rob...

متن کامل

Robust Identification of Smart Foam Using Set Mem-bership Estimation in A Model Error Modeling Frame-work

The aim of this paper is robust identification of smart foam, as an electroacoustic transducer, considering unmodeled dynamics due to nonlinearities in behaviour at low frequencies and measurement noise at high frequencies as existent uncertainties. Set membership estimation combined with model error modelling technique is used where the approach is based on worst case scenario with unknown but...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016